Statistical Analysis and Parameter Selection for Mapper

Authors

  • Mathieu Carrière
  • Bertrand Michel
  • Steve Oudot
Abstract

In this article, we study the statistical convergence of the 1-dimensional Mapper to its continuous analogue, the Reeb graph. We show that the Mapper is an optimal estimator of the Reeb graph, which gives, as a byproduct, a method to automatically tune its parameters and compute confidence regions on its topological features, such as its loops and flares. This makes it possible to circumvent the brute-force approach of testing a large grid of parameters and keeping the most stable ones, which is widely used in visualization, clustering and feature selection with the Mapper.


Similar resources

Spatial Design for Knot Selection in Knot-Based Low-Rank Models

Analysis of large geostatistical data sets usually entails expensive matrix computations. This problem creates challenges in implementing statistical inference for traditional Bayesian models. In addition, researchers often face multiple spatial data sets with complex spatial dependence structures that are difficult to analyze. This is a problem for MCMC sampling algorith...


Sequence analysis Optimal seed solver: optimizing seed selection in read mapping

Motivation: Optimizing seed selection is an important problem in read mapping. The number of non-overlapping seeds a mapper selects determines the sensitivity of the mapper while the total frequency of all selected seeds determines the speed of the mapper. Modern seed-and-extend mappers usually select seeds with either an equal and fixed-length scheme or with an inflexible placement scheme, bot...


On Model-Based Clustering, Classification, and Discriminant Analysis

The use of mixture models for clustering and classification has burgeoned into an important subfield of multivariate analysis. These approaches have been around for a half-century or so, with significant activity in the area over the past decade. The primary focus of this paper is to review work in model-based clustering, classification, and discriminant analysis, with particular attenti...


Application of Three Parameter Interval Grey Numbers in Enterprise Resource Planning Selection

This paper applies a new multi-attribute decision-making (MADM) model to help companies with the enterprise resource planning (ERP) selection problem, based on the Balanced Score Card method. This paper uses three-parameter interval grey numbers, which are derived from Grey theory (proposed by J. Deng). These numbers are used instead of linguistic variables. Besides, a new weighting method that outcomes f...


Determination of the Size of a Trial, Using Lindley’s Method

Extended Abstract. When a new treatment is being considered, trials are carried out to estimate the increase in performance which is likely to result if the new treatment were to replace the treatment in current use. Many authors have looked at this problem and many procedures have been introduced to solve it. An important feature of the analysis in this work is that account is taken of the fac...



Journal:
  • CoRR

Volume abs/1706.00204

Publication date: 2017